15 research outputs found

    Secret sharing vs. encryption-based techniques for privacy preserving data mining

    Privacy preserving querying and data publishing have been studied in the context of statistical databases and statistical disclosure control. Recently, large-scale data collection and integration efforts have increased privacy concerns, motivating data mining researchers to investigate the privacy implications of data mining and how data mining can be performed without violating privacy. In this paper, we first provide an overview of privacy preserving data mining focusing on distributed data sources, then we compare two technologies used in privacy preserving data mining. The first technology is encryption based and was used in earlier approaches. The second technology is secret sharing, which has recently been considered as a more efficient approach.

    Impossibility of unconditionally secure scalar products

    The ability to perform scalar products of two vectors, each known to a different party, is a central problem in privacy preserving data mining and other multi-party computation problems. The ongoing search for both efficient and secure scalar product protocols has revealed that this task is not easy. In this paper we show that, indeed, scalar products can never be made secure in the information-theoretic sense. We show that any attempt to make unconditionally secure scalar products will inevitably allow one of the parties to learn the other party's input vector with high probability. On the other hand, we show that under various assumptions, such as the existence of a trusted third party or the difficulty of discrete logarithms, both efficient and secure scalar products do exist. We propose two new protocols for secure scalar products and compare their performance with existing secure scalar product protocols.
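To make the "trusted third party" assumption concrete, here is a minimal sketch of a commonly described commodity-server style construction. It is not necessarily either of the protocols proposed in the paper. The modulus `P`, the function names, and the single-process simulation of all three parties are assumptions made purely for illustration.

```python
import secrets

# Hypothetical public prime modulus; all arithmetic is done modulo P.
P = 2**61 - 1


def dot(u, v, p=P):
    """Scalar product of two equal-length integer vectors, modulo p."""
    return sum(a * b for a, b in zip(u, v)) % p


def dealer(n, p=P):
    """Trusted third party: hands out correlated randomness with ra + rb = Ra.Rb."""
    Ra = [secrets.randbelow(p) for _ in range(n)]
    Rb = [secrets.randbelow(p) for _ in range(n)]
    ra = secrets.randbelow(p)
    rb = (dot(Ra, Rb, p) - ra) % p
    return (Ra, ra), (Rb, rb)


def shared_scalar_product(x, y, p=P):
    """Simulate Alice (holding x) and Bob (holding y) in one process.

    Returns additive shares (alice_share, bob_share) whose sum mod p is x.y.
    """
    (Ra, ra), (Rb, rb) = dealer(len(x), p)

    # Alice -> Bob: her vector masked with Ra.
    x_masked = [(a + r) % p for a, r in zip(x, Ra)]
    # Bob -> Alice: his vector masked with Rb.
    y_masked = [(b + r) % p for b, r in zip(y, Rb)]

    # Bob picks his own random output share and sends u to Alice.
    bob_share = secrets.randbelow(p)
    u = (dot(x_masked, y, p) + rb - bob_share) % p

    # Alice removes the masking terms using her dealer values.
    alice_share = (u - dot(Ra, y_masked, p) + ra) % p
    return alice_share, bob_share


if __name__ == "__main__":
    a, b = shared_scalar_product([1, 2, 3], [4, 5, 6])
    print((a + b) % P)
```

Each party sees only masked vectors and ends up with a random-looking share. The security of this sketch rests entirely on the dealer being honest and not colluding with either party.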

    Discovering private trajectories using background information

    Trajectories are spatio-temporal traces of moving objects which contain valuable information to be harvested by spatio-temporal data mining techniques. Applications like city traffic planning, identification of evacuation routes, trend detection, and many more can benefit from trajectory mining. However, the trajectories of individuals often contain private and sensitive information, so anyone who possesses trajectory data must take special care when disclosing this data. Removing identifiers from trajectories before release is not effective against linkage-type attacks, and rich sources of background information make it even worse. An alternative is to apply transformation techniques to map the given set of trajectories into another set where the distances are preserved. This way, the actual trajectories are not released, but the distance information can still be used for data mining techniques such as clustering. In this paper, we show that an unknown private trajectory can be reconstructed using the available background information together with the mutual distances released for data mining purposes. The background knowledge is in the form of known trajectories and extra information such as the speed limit. We provide analytical results which bound the number of known trajectories needed to reconstruct private trajectories. Experiments performed on real trajectory data sets show that the number of known samples needed is surprisingly smaller than the theoretical bounds.

    Privacy risks in trajectory data publishing: reconstructing private trajectories from continuous properties

    Location and time information about individuals can be captured through GPS devices, GSM phones, RFID tag readers, and other similar means. Such data can be pre-processed to obtain trajectories, which are sequences of spatio-temporal data points belonging to a moving object. Recently, advanced data mining techniques have been developed for extracting patterns from moving object trajectories to enable applications such as city traffic planning, identification of evacuation routes, trend detection, and many more. However, when special care is not taken, trajectories of individuals may also pose serious privacy risks even after they are de-identified or mapped into other forms. In this paper, we show that an unknown private trajectory can be reconstructed from knowledge of its properties released for data mining, which at first glance may not seem to pose any privacy threats. In particular, we propose a technique to demonstrate how private trajectories can be reconstructed from knowledge of their distances to a bounded set of known trajectories. Experiments performed on real data sets show that the number of known samples needed is surprisingly smaller than the theoretical bounds.

    Improved fuzzy vault scheme for fingerprint verification

    Fuzzy vault is a well-known technique to address the privacy concerns in biometric identification applications. We revisit the fuzzy vault scheme to address implementation, efficiency, and security issues encountered in its realization. We use fingerprint data as a case study. We compare the performance of two different methods used in the implementation of fuzzy vault, namely brute force and Reed-Solomon decoding. We show that the locations of fake (chaff) points in the vault leak information on the genuine points and propose a new chaff point placement technique that makes distinguishing genuine points impossible. We also propose a novel method for creation of chaff points that decreases the success rate of the brute force attack from 100% to less than 3.5%. While this paper lays out a complete guideline as to how the fuzzy vault can be implemented in an efficient and secure way, it also points out that more research is needed to thwart the proposed attacks by presenting ideas for future research.

    Impossibility of Unconditionally Secure Scalar Products (Data & Knowledge Engineering)

    The ability to perform scalar products of two vectors, each known to a different party, is a central problem in privacy preserving data mining and other multi-party computation problems. The ongoing search for both efficient and secure scalar product protocols has revealed that this task is not easy. In this paper we show that, indeed, scalar products can never be made secure in the information-theoretic sense. We show that any attempt to make unconditionally secure scalar products will always allow one of the parties to learn the other party's input vector with high probability. On the other hand, we show that under various assumptions, such as the existence of a trusted third party, both efficient and secure scalar products do exist.

    Interstrain Antigenic Variability of Mumps Viruses

    Recent concerns about privacy issues have motivated data mining researchers to develop methods for performing data mining while preserving the privacy of individuals. However, the current techniques for privacy preserving data mining suffer from high communication and computation overheads, which are prohibitive even for a modest database size. Furthermore, the proposed techniques make strict assumptions about the involved parties, which need to be relaxed in order to reflect real-world requirements. In this paper we concentrate on a distributed scenario where the data is partitioned vertically over multiple sites and the involved sites would like to perform clustering without revealing their local databases. For this setting, we propose a new protocol for privacy preserving k-means clustering based on additive secret sharing. We show that the new protocol is more secure than the state of the art. Experiments conducted on real and synthetic data sets show that, in realistic scenarios, the communication and computation cost of our protocol is considerably less than that of the state of the art, which is crucial for data mining applications.
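As a rough illustration of the additive secret sharing primitive mentioned in this abstract, the sketch below splits an integer into random shares that sum to the secret modulo a public value. It shows only the generic primitive, not the paper's k-means protocol. The modulus and all function names are hypothetical choices for this example.

```python
import secrets

# Hypothetical public modulus; secrets must be integers in [0, MODULUS).
MODULUS = 2**61 - 1


def share(value, n_parties, modulus=MODULUS):
    """Split value into n_parties additive shares that sum to value mod modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def reconstruct(shares, modulus=MODULUS):
    """Recover the secret by adding all shares modulo modulus."""
    return sum(shares) % modulus


def add_shared(a_shares, b_shares, modulus=MODULUS):
    """Each party adds its two local shares; no communication is needed."""
    return [(a + b) % modulus for a, b in zip(a_shares, b_shares)]


if __name__ == "__main__":
    a = share(10, 3)
    b = share(32, 3)
    print(reconstruct(add_shared(a, b)))
```

Any proper subset of the shares is uniformly random and reveals nothing about the secret. Sums such as the per-cluster totals that k-means needs can be accumulated share by share and reconstructed only at the end.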

    Securing fuzzy vault schemes through biometric hashing

    The fuzzy vault scheme is a well-known technique to mitigate privacy, security, and usability related problems in biometric identification applications. The basic idea is to hide biometric data along with secret information amongst randomly selected chaff points during the enrollment process. Only the owner of the biometric data who presents correct biometrics can recover the secret and identify himself. Recent research, however, has shown that the scheme is vulnerable to certain types of attacks. The recently proposed “correlation attack”, which allows linking two vaults of the same biometric, poses serious privacy risks that have not been sufficiently addressed. The primary aim of this work is to remedy those problems by proposing a framework based on distance preserving hash functions to render the correlation attack inapplicable. We first give definitions which capture the requirements such hash functions must possess. We then propose a specific family of hash functions that fulfills these requirements and lends itself to efficient implementation. We also provide formal proofs that the proposed family of hash functions indeed protects the fuzzy vault against correlation attacks. We implement a hashed fuzzy vault using fingerprint data and extensively investigate the effects of the proposed method on the false accept and false reject rates (FAR and FRR, respectively). Implementation results suggest that the proposed method provides complete protection against correlation attacks at the expense of a small degradation in the FRR.